SUMMARY


Sequences encoding the N protein of the bovine enteritic coronavirus F15
strain (BECV-F15) have been cloned in PBR322 plasmid using cDNA produced
by priming with oligo-dT on purified viral genomic RNA. Some 265
insert-containing clones were studied. Hybridization of these inserts with
poly(A)+ RNA extracted from infected cells led to the conclusion that they
were located at the 3’-end of the genome.


After subcloning in M13 phage DNA, clones were sequenced by the Sanger 
technique. A 1,710-nucleotide sequence corresponding to the gene coding for 
the viral N-protein was established. It shows 2 overlapping open reading frames 
(ORF). The 3’-non-coding end of the gene has an 8-nucleotide sequence in 
common with the homologous genome areas of MHV, TGE and IBV viruses. 
This sequence may represent the polymerase RNA binding site. 


An upstream sequence surrounding the first AUG of the smaller ORF cor- 
responds to a potentially functional initiation codon. The sequence of the 
primary translation product deduced from the DNA sequence predicts a 
polypeptide of 207 amino acids (22.9 Kd) with a high leucine (19.8 %) con- 
tent, possessing a hydrophobic N-terminal end. 


Received December 4, 1987. 


(1) Present address: Laboratoire Central de Recherches Vétérinaires, 22 rue Pierre Curie, BP 67, 
94703 Maisons-Alfort Cedex. 
(2) To whom all correspondence should be addressed. 


124 C. CRUCIERE AND J. LAPORTE 


The larger ORF has a coding capacity of 448 amino acids (49.4 Kd), 
corresponding to the N-protein molecular weight. The deduced protein 
possesses 43 serine residues (9.6 % of the total amino acid content) which 
may be phosphorylated and involved in N-protein/RNA binding. N-protein
also has 5 regions with a high basic amino acid content. One of them 
is also serine-rich and has a strong homology site with MHV, TGE and 
IBV viruses. In the first part of the N-terminal, a 12-amino-acid sequence 
(PRWYFYYLGTGP) is highly conserved for BECV-F15, JHM, TGE and IBV 
viruses. BCV Mebus strain and BECV-F15 have only minor differences in 
their N-protein sequence. 


KEY-WORDS: Coronavirus, Protein, Nucleocapsid, Genome; BECV-F15
strain, N-protein sequence. 


INTRODUCTION 


Bovine enteritic coronavirus (BECV) belongs to the monogeneric Coronaviridae
family, which has avian infectious bronchitis virus as its type species.
Coronaviruses are pleomorphic, enveloped particles surrounded by a fringe of
club-shaped spikes that looks like a corona in the electron microscope and
gives the family its name. The viral genome is a positive single-stranded RNA
of approximately 18 to 20 kb; its 3’-end is polyadenylated [19, 22]. This
genome codes for the viral proteins, namely the nucleocapsid (N), membrane (E1)
and spike (E2) proteins and several non-structural proteins. They are
translated from a 3’-coterminal nested set of mRNA, each also having a common
5’-leader sequence [8]. Only the unique 5’-terminal sequence of each mRNA,
which is not present in the next smaller RNA of the set, is translated.


It was recently established that, in fact, BECV contains 4 main structural 
proteins: the nucleoprotein N (50 Kd), the transmembrane E1 glycoprotein 
(28 Kd) and 3 peplomer glycoproteins E2, gp105 and gp95. The haemagglutinin 
protein E2 (125 Kd) is cleaved by reducing agents into 2 subunits having 
molecular weights of 65 Kd; the main neutralizing epitopes of the viral par- 
ticle are located on gp105 (105 Kd) [9, 24, 6]; the structure of gp95 (95 Kd) 
is not clearly established. 


The BECV induces very severe, often fatal, diarrhoea in young calves. It 
was described for the first time in the United States of America [13]; we have 
been able to isolate such a virus in the faeces of diarrhoeic calves in France 
and to experimentally reproduce the disease [4]. These 2 strains of BECV are 
distinguishable by using monoclonal antibodies [23]. 


BECV = bovine enteritic coronavirus.
BSA = bovine serum albumin.
FCS = foetal calf serum.
N = nucleocapsid.
ORF = open-reading frame.
Vaccines produced from cell culture of attenuated or inactivated BECV
are not totally protective and they necessitate production of large volumes 
of viral suspension because of the low infectious titre obtained in authorized 
cell lines. For these reasons, we have started cloning and sequencing the French 
F15 strain of BECV to try and produce cheaper and more efficient vaccines 
by genetic engineering or by oligopeptidic synthesis. 


MATERIALS AND METHODS 
Cell culture and virus production. 


HRT18 cells (human rectal tumour cell line) were grown in RPMI-1640 medium
containing 15 % foetal calf serum (FCS) as described [10], except that tylosin
(10 µg/ml) and lincomycin (200 µg/ml) were added to the medium instead of
penicillin and streptomycin.


Bovine enteritic coronavirus F15 strain (BECV-F15) was isolated from diarrhoeic 
calf faeces, then directly adapted on HRT18 cells [10] and plaque-purified. It was 
grown as previously described [4]. Infectious titres reached 5 x 10⁵ plaque-forming
units (PFU)/ml.


Virus purification. 


After freezing and thawing of infected cells together with supernatant and then 
clarification, the virus was purified by 2 ultracentrifugation steps (velocity then isopyc- 
nic) [9]. 


Genomic RNA purification. 


A 1-ml sample of purified virus suspension in distilled water was added to the
same volume of 2-fold concentrated TNE buffer (20 mM pH 8 Tris-HCl, 200 mM
NaCl, 2 mM EDTA) containing 400 µg of proteinase K. After incubation for 30 min
at 37°C, then for 5 min at 50°C, an equal volume of the same buffer containing 2 %
SDS was added and incubation was continued for 30 min at 25°C.


Genomic RNA was phenol/chloroform-extracted, then precipitated in 2.5 volumes
of 0.25 M sodium acetate in ethanol. After one night at -20°C, the RNA suspension
was centrifuged for 20 min at 10,000 g, the pellet washed with 75 % ethanol, dried
and dissolved in a minimal volume of distilled water. One optical density (OD) unit
at 260 nm corresponded to 40 µg/ml of single-stranded RNA [12].
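
The conversion factor quoted above (1 OD unit at 260 nm for 40 µg/ml of single-stranded RNA) can be applied as in the following minimal sketch. It is an illustration only: the function names, the dilution factor and the sample reading are hypothetical and are not values from this study.

```python
# Conversion factor stated in the text: 1 OD unit at 260 nm = 40 ug/ml ssRNA.
SSRNA_UG_PER_ML_PER_OD = 40.0


def rna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """Return the ssRNA concentration (ug/ml) of the undiluted sample."""
    if od260 < 0 or dilution_factor <= 0:
        raise ValueError("od260 must be >= 0 and dilution_factor > 0")
    return od260 * SSRNA_UG_PER_ML_PER_OD * dilution_factor


def volume_for_mass_ul(mass_ug, concentration_ug_per_ml):
    """Return the stock volume (ul) that contains mass_ug of RNA."""
    if concentration_ug_per_ml <= 0:
        raise ValueError("concentration must be positive")
    return mass_ug / concentration_ug_per_ml * 1000.0


if __name__ == "__main__":
    # Hypothetical reading: OD260 of 0.25 measured on a 1/100 dilution.
    stock = rna_concentration_ug_per_ml(0.25, dilution_factor=100)
    print(stock, "ug/ml")
    print(volume_for_mass_ul(10, stock), "ul needed for 10 ug")
```

With this hypothetical reading the stock would be at 1,000 µg/ml, so that 10 µg of RNA would be contained in about 10 µl.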


cDNA cloning. 


The synthesis of cDNA complementary to the 3’-end of the BECV-F15 genome
was carried out in a volume of 52 µl: 10 µg of BECV RNA in 10 µl, denatured at
65°C for 5 min and quickly chilled in an ice bath, were added to 42 µl of 100 mM
pH 8.3 Tris-HCl at 42°C containing 100 mM KCl, 100 mM MgCl2, 10 mM
dithiothreitol, 4 µg actinomycin D, 500 µM each of the 4 dNTP, 75 units RNasin,
140 units reverse transcriptase (P.H. Stehelin) and, as primer, 10 µg oligo-dT.
Incubation was performed for 2 h at 42°C and the reaction was stopped by adding
2 µl 500 mM EDTA. Reaction products were extracted with phenol/chloroform, then
chloroform, and ethanol-precipitated. Free RNA strands not hybridized with cDNA
were digested with endonuclease T1 [25]; these digests and free nucleotides were
removed by gel filtration on a spun column of « Sephadex-G50 » medium (Pharmacia)
[12].


The RNA-cDNA heteroduplexes were then poly-dC tailed: 2 pmoles of 3’ ends
were dissolved in 20 µl of 25 mM Tris-HCl buffer pH 7 containing 100 mM
K-cacodylate, 0.2 mM DTT, 1 mM CoCl2, 0.2 mM dCTP, 50 µg bovine serum
albumin (B.R.L.), 13.5 units of terminal deoxynucleotidyl transferase (B.R.L.) and
30 µCi α-32P-dCTP (3,000 Ci/mmole). The reaction was carried out at 37°C for
3 min and stopped by adding 2 µl 500 mM EDTA [16]. The product was
phenol/chloroform-extracted. An average of 20 dC per 3’-end of heteroduplex was
obtained.


C-tailed heteroduplexes were annealed to dG-tailed PstI-linearized PBR322 plasmid
(1 mole for 2 moles), in a volume where the plasmid was at a concentration of 5 ng/l,
at 65°C for 10 min. Competent RR1 Escherichia coli cells were transfected with this
material [5]. The total DNA concentration was 0.25 µg/ml.


Identification of specific BECV inserts. 


E. coli cells were grown overnight in a medium containing 12 µg/ml tetracycline,
then treated by alkaline lysis [12]. Plasmid DNA was extracted by phenol/chloroform
treatment and ethanol-precipitated. DNA inserts were excised with PstI restriction
enzyme: 1.2 µl of 10-fold concentrated buffer (100 mM pH 7.5 Tris-HCl, 1 M NaCl,
100 mM MgCl2, 1 mg/ml BSA) and 2 units of PstI enzyme (B.R.L.) were added to
10 µl of plasmid DNA solution. Insert size was established by electrophoretic
migration in 1 % agarose gels in TBE buffer (89 mM Tris, 89 mM boric acid, 2 mM
EDTA).


Probes were prepared by nick-translation in a 20 µl volume containing 0.5 µg
DNA, 2 µl of 10-fold concentrated buffer (500 mM pH 7.2 Tris-HCl, 100 mM
MgSO4, 1 mM DTT, 500 µg/ml BSA), 20 µM each of the 4 dNTP, 2.5 ng pancreatic
DNase I (Boehringer), 40 µCi α-32P-dCTP (800 Ci/mmole) and 0.8 unit DNA
polymerase I. The mixture was incubated for 2 h at 16°C. The reaction was stopped
by adding 3 µl 500 mM EDTA pH 8. Free nucleotides were removed by filtration
through a spun column.


Northern and Southern blots were performed as described by Maniatis [12]. Probes
were incubated for hybridization overnight at 42°C (Southern) or at 55°C (Northern);
blots were then washed in low salt concentration solutions: three times for
15 min in 0.1 % SDS, 2 x SSC and twice for 15 min in 0.1 % SDS, 0.1 x SSC at 52°C.


DNA sequencing and sequence analysis. 


M13 dideoxy sequencing was carried out according to the Sanger technique [17], 
using α-35S-dATP (New England Nuclear). In short, the main steps were the
following: 


DNA replicative forms of mp18 or mp19 M13 phage were prepared [3]; they 
possess polylinkers with single cleavage sites for EcoRI, SacI, KpnI, SmaI, BamHI,
SalI, PstI, SphI and HindIII restriction enzymes.


Viral cDNA inserts were excised from PBR322 plasmid and digested with restriction
enzymes having sites in the M13 polylinker. DNA fragments ranging between
300 and 500 bases were purified by electrophoresis in low melting point agarose
(Gibco-BRL) gel. M13 phage DNA was cleaved by the same enzymes and 5’ end phosphates
were removed by alkaline phosphatase (Boehringer) treatment [12]. DNAs were then
phenol/chloroform-extracted and ethanol-precipitated. After ligation of the insert
into the vector, performed with 50 ng of insert in a molar ratio of 3/1, TG1 E. coli
competent cells were transfected [5].

TG1 recombinant clones were selected in an IPTG- and X-gal-containing medium.
White plaques were then checked by hybridization with a radioactive insert probe.
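
The 3/1 insert-to-vector molar ratio used for ligation can be turned into a mass of vector DNA, as in the sketch below. This is an illustration under stated assumptions, not the authors' calculation: the insert is taken as 400 bp (inside the 300 to 500 range given above), the vector length of 7,249 bp is the published size of M13mp18 and is not stated in this article, and the function name is hypothetical.

```python
def vector_mass_ng(insert_ng, insert_bp, vector_bp, insert_to_vector_ratio):
    """Mass of vector (ng) giving the requested insert:vector molar ratio.

    Moles of double-stranded DNA are proportional to mass / length, so
    ratio = (insert_ng / insert_bp) / (vector_ng / vector_bp).
    """
    values = (insert_ng, insert_bp, vector_bp, insert_to_vector_ratio)
    if min(values) <= 0:
        raise ValueError("all arguments must be positive")
    return insert_ng * vector_bp / (insert_bp * insert_to_vector_ratio)


if __name__ == "__main__":
    # Assumed sizes: 400-bp insert, 7,249-bp M13mp18 replicative form.
    print(round(vector_mass_ng(50, 400, 7249, 3), 1), "ng of vector")
```

Under these assumptions, 50 ng of insert corresponds to roughly 300 ng of vector; a shorter or longer insert shifts the figure proportionally.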


Sequencing was then performed using a primer complementary to the 3’-end of 
the DNA strand to be transcribed. These primers were synthesized in an automated 
DNA synthesizer (Biosearch 8600). 


Sequence data were analysed and assembled with the program of Queen and Korn
[14], part of the « Beckman Microgenie » package (March 1985 version, Beckman
Instruments, Inc.), adapted to the « IBM PC-XT » microcomputer.


RESULTS 
cDNA cloning. 


Starting material for cDNA synthesis was 10 µg of purified and
temperature-denatured viral RNA. When analysed by electrophoresis in alkaline
agarose gels, the sizes of the cDNA obtained using oligo-dT as a primer ranged
between 1.3 and 6.0 Kb. After annealing of the heteroduplexes to PBR322, this
construction was transfected into competent E. coli cells and we obtained
2 x 10⁵ clones/µg of PBR322.


Some 265 colonies containing 0.3- to 2.0-Kb inserts were studied. Inserts
larger than 0.5 Kb very often showed an internal PstI site (results
not shown). Their viral specificity was checked, after 32P-labelling by
nick-translation, by hybridization with purified genomic viral RNA or cellular RNA
(fig. 1). Viral-specific inserts were further used for characterization of other
inserts.


Insert orientation was established by hybridization with inserts having no 
PstI site and by restriction endonuclease mapping with enzymes having no 
or only one cleavage site in PBR322 plasmid. 


The location of the insert along the viral genome was determined by Northern
blot analysis: full-length inserts or purified products of insert restriction
cleavage were hybridized with poly(A)+ RNA extracted from infected or
non-infected cells. Before hybridization these RNA were electrophoresed in
agarose gel containing methylmercuric hydroxide. Under these experimental
conditions, 8 viral-specific poly(A)+ messenger RNA bands were resolved
(J. Laporte and C. Cruciere; to be published). They form a specific nested set
of RNA as established for other coronaviruses. All the inserts we obtained
hybridized with the 8 viral RNA bands (results not shown); they were therefore
complementary to the 3’ end of the viral genome.
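
The reasoning behind this conclusion, namely that only a probe from the common 3’-end can light up every member of a 3’-coterminal nested set, can be sketched as follows. The model is deliberately simplified (the common 5’-leader is ignored) and the mRNA sizes are hypothetical placeholders, not measured values; the function name is also an assumption.

```python
def hybridizing_mrnas(probe_start, probe_end, mrna_lengths):
    """Return the mRNA lengths that overlap a probe.

    Coordinates are distances (nt) from the 3'-end of the genome, poly(A)
    excluded. Each mRNA of a 3'-coterminal nested set covers the interval
    [0, length), so it overlaps the probe whenever length > probe_start.
    """
    if not 0 <= probe_start < probe_end:
        raise ValueError("expected 0 <= probe_start < probe_end")
    return [n for n in mrna_lengths if n > probe_start]


# Hypothetical sizes (nt) for 8 nested mRNAs; NOT measured values.
HYPOTHETICAL_SET = [1800, 2500, 3400, 4200, 5100, 6800, 9500, 19000]

if __name__ == "__main__":
    # A probe starting at the 3'-end overlaps all 8 species.
    print(len(hybridizing_mrnas(0, 2000, HYPOTHETICAL_SET)))
    # A probe lying further upstream misses the smallest species.
    print(len(hybridizing_mrnas(2000, 2400, HYPOTHETICAL_SET)))
```

In this toy model an insert that hybridizes with all 8 bands must overlap the smallest mRNA, which places it at the 3’-end of the genome.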


Figure 2 presents the schematic location of the inserts we have studied.
Insert 1.6 is 2,000 nucleotides long and the 5’-end of insert 2.56 is
presumably 2,400 nucleotides from the 3’-end of the viral genome. As deduced
from the sizes of the N and E1 viral proteins, these inserts should cover the
whole length of the N gene (1,700 nucleotides) and the beginning of the
5’-adjacent E1 gene (320 nucleotides).
FIG. 1. Screening of insert virus specificity.


Radioactive probes were prepared from insert-containing PBR322 plasmid. These probes were 
hybridized on nitrocellulose sheets with dots of RNA extracted from non-infected (C) or 
BECV-F15-infected (V) HRT18 cells. Hybridization was checked by autoradiography. In 
the experiment shown, inserts 1.6, 1.22 and 2.56 were clearly virus-specific. 


cDNA sequencing. 


As mentioned above, fragments of about 400 bp from the cDNA clones were subcloned
in mp18 or mp19 M13 phage DNA. Their nucleotide sequences were determined
by sequencing both M13 DNA strands or by multiple sequencing of
one strand. We have been able to establish a 1,710-nucleotide sequence from
the 3’-end of the genome (fig. 3). This sequence has 2 overlapping open-reading
frames (ORF). The main ORF stretches from nucleotide 74 to nucleotide 1,416,


BECV-F15 CORONAVIRUS N PROTEIN SEQUENCE 129 


[Figure 2: restriction map of the 3’-end of the genome (BamHI, PstI, SacI, SphI,
PvuII and PstI sites) with the cDNA inserts, including 1.6, 1.19, 1.20, 1.26, 1.39,
1.56 and 2.56, drawn beneath it.]


FIG. 2. Arrangement of some of the cDNA clones obtained using oligo-dT as primer.


the smaller one from nucleotide 135 to nucleotide 755 (fig. 3). The first has
a coding capacity for a 448-amino-acid protein, the second for a
207-amino-acid protein (fig. 4).
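
The coordinates quoted for the two ORF can be checked with simple arithmetic. The sketch below is an illustration and not part of the original analysis; it assumes 1-based, inclusive coordinates, and the function names are hypothetical. With that convention the secondary ORF (135 to 755) gives exactly 207 codons, whereas the main ORF (74 to 1,416) gives 1,343 nucleotides, one short of the 1,344 needed for 448 codons, so the printed end coordinate may follow a different convention.

```python
def orf_codons(first_nt, last_nt):
    """Return (length_nt, whole_codons, leftover_nt) for an ORF.

    Coordinates are assumed to be 1-based and inclusive.
    """
    if first_nt < 1 or last_nt < first_nt:
        raise ValueError("expected 1 <= first_nt <= last_nt")
    length = last_nt - first_nt + 1
    return length, length // 3, length % 3


def mean_residue_mass(mass_kd, residues):
    """Average mass per amino acid residue, in daltons."""
    if residues <= 0:
        raise ValueError("residues must be positive")
    return mass_kd * 1000.0 / residues


if __name__ == "__main__":
    print(orf_codons(135, 755))    # secondary ORF as printed
    print(orf_codons(74, 1416))    # main ORF as printed
    print(round(mean_residue_mass(49.4, 448), 1))
    print(round(mean_residue_mass(22.9, 207), 1))
```

Both predicted molecular weights correspond to about 110 daltons per residue, the usual average for proteins, which suggests that the quoted masses and lengths are mutually consistent.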


DISCUSSION 


We have determined, by cDNA cloning of BECV-F15 genomic RNA using 
an oligo-dT primer, a sequence of 1,710 nucleotides. 

We assume that this sequence comprises the nucleocapsid protein gene se- 
quence. 

For every coronavirus so far studied, the gene coding for the N protein 
is located at the 3’-end of the viral genome. The same conclusion arises from 
our studies on the BECV-F15 poly(A)+ RNA (to be published). 
The largest ORF has a 1,344-nucleotide length and encodes a 448-amino-acid
protein with a molecular weight of 49.4 Kd. Our previous results [4] had
shown a 50-Kd molecular weight N protein.


It was recently reported [11] that, in the US Mebus strain of the related bovine
coronavirus (BCV), the N protein gene is also located at the 3’-end of the viral
genome.


Open-reading frames. 


Main ORF. — The distance between the first AUG following the initia- 
tion codon and this initiation codon is 693 nucleotides. When we compared 
the sequence around the initiation codon to homologous sequences of dif- 
ferent strains of MHV we found the same CTAAAC sequence upstream of 
the initiation AUG. 


Secondary ORF. — The consensus sequence GUAAUGGC surrounding 
its initiation codon is one of optimal environment for starting mRNA transla- 
tion [7]. Bunyaviruses and adenoviruses express 2 different proteins from only 
one gene by having 2 overlapping ORF [7]. So, we cannot exclude the transla- 
tion of a protein from the secondary ORF. Its predicted molecular weight 
is 22.9 Kd for 207 amino acids. This protein has a rather high leucine con- 
tent: 19.8 % compared to 5S % for the N protein. Furthermore, its N-terminal 
end is hydrophobic and is a potential membrane anchor region. Genes
presenting 2 different ORF have also been described for other coronaviruses:
mRNAs of JHM virus [20], mRNAp of IBV [2] and N protein mRNA of the Mebus BCV
strain [11].
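
The remark on the GUAAUGGC context can be illustrated with a small classifier. The sketch assumes that the criterion referred to in [7] is the widely used rule that a purine at position -3 and a G at position +4 (the A of the AUG being +1) favour initiation; this is an assumption, since the article does not spell the rule out, and the function name and the three category labels are hypothetical.

```python
PURINES = {"A", "G"}


def initiation_context(seq, aug_index):
    """Classify the context of the AUG that starts at aug_index (0-based).

    Position -3 is three nucleotides upstream of the A of the AUG;
    position +4 is the nucleotide that follows the G of the AUG.
    """
    seq = seq.upper().replace("T", "U")
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at aug_index")
    if aug_index < 3 or aug_index + 3 >= len(seq):
        raise ValueError("need 3 nt upstream and 1 nt downstream of the AUG")
    purine_minus3 = seq[aug_index - 3] in PURINES
    g_plus4 = seq[aug_index + 3] == "G"
    if purine_minus3 and g_plus4:
        return "strong"
    if purine_minus3 or g_plus4:
        return "adequate"
    return "weak"


if __name__ == "__main__":
    # Context quoted in the text for the secondary ORF.
    print(initiation_context("GUAAUGGC", 3))
```

For the sequence quoted in the text, both positions are favourable under this rule, which agrees with the statement that translation of the secondary ORF cannot be excluded.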


Non-coding 3’-end. 


This part of the genome may play an important role during the transcription
of genomic RNA into the complementary minus RNA strand. Sequence
homology between BECV-F15 and MHV for the last 100 nucleotides of the
coding part is only 59 %, but homology increases to 75 % for the 3’-non-coding
end. A 10-nucleotide sequence (GGGAAGAGCT) was found in common
at the same place of this gene area for MHV and IBV viruses [2] (fig. 5).
We find an identical sequence (except the last T) for BECV-F15 virus
between nucleotides 1,631 and 1,640. When looking at the TGEV genome
sequence [15], we observe the same sequence (except the first G) in the
3’-non-coding end between nucleotides 1,923 and 1,931. Our analysis strengthens
Boursnell’s hypothesis: this sequence, well conserved within the Coronaviridae
family, should have an important function during RNA replication, possibly as
an RNA polymerase binding site.


FIG. 3. Nucleotide sequence of the 3’-end of the BECV-F15 genome.


This 1,710-nucleotide sequence has 2 large overlapping ORF. M = potential translation
initiation codons; U = translation stop codons. Solid line = main ORF; dashed
line = secondary ORF.
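
A search for the conserved GGGAAGAGCT element that tolerates one substitution, as in the comparisons above, can be sketched as follows. The test fragment is a short hypothetical sequence written for the example, not the published 3’-end, and the function name is an assumption.

```python
def find_motif(sequence, motif, max_mismatches=1):
    """Return (1-based position, window, mismatches) for every window of
    `sequence` that differs from `motif` by at most `max_mismatches`
    substitutions. RNA input is accepted (U is read as T)."""
    sequence = sequence.upper().replace("U", "T")
    motif = motif.upper().replace("U", "T")
    hits = []
    for i in range(len(sequence) - len(motif) + 1):
        window = sequence[i:i + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((i + 1, window, mismatches))
    return hits


if __name__ == "__main__":
    # Hypothetical fragment carrying the motif with its last T replaced.
    fragment = "CCAAGGGGAAGAGCCAGC"
    print(find_motif(fragment, "GGGAAGAGCT", max_mismatches=1))
```

Allowing a single mismatch is what lets the BECV-F15 and TGEV variants described above be recognized as instances of the same element; the 8 nucleotides shared by all four viruses would be found with `max_mismatches=0` and the shorter motif.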


N-protein-predicted amino acid sequence. 


BECV-F15 N protein has very strong homology with the same protein of 
JHM virus (70.3 %) (fig. 6) and only 25.2 % and 24.1 %, respectively, with 
TGE and IBV virus N proteins. These coronavirus N proteins are 
phosphorylated on their serine residues [18]. Our results show 43 serine residues 
in BECV-F15 nucleocapsid protein (9.6 % of the total amino acids). For this 
virus and for JHM, TGE and IBV viruses we find 2 main areas where serine 
residues are clustered. For BECV-F15 and JHM viruses they are in homologous 
areas (nucleotides 9 to 19 and nucleotides 191 to 220) of low overall homology 
(58 % and 53 %). One serine cluster is common to the 4 viruses. This fact 
is striking because of the low sequence homology between these viruses. 


It was previously established [21, 1] that N protein genomic RNA binding 
sites are located in the basic portions of the protein. For the complete se- 
quence there is an excess of 19 basic residues compared to acidic residues. 
There are 5 basic-rich regions which are found in homologous areas of MHV, 
TGE and IBV viruses. Concerning BECV-F15 and MHV, 4 of these areas 
have 90 % homology. The fifth has only 60 % homology but is also serine- 
rich and possesses a sequence in common with TGE and IBV viruses (amino 
acids 193 to 222). It may have a more specific function in protein/RNA 
recognition. 
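
The composition figures used in this discussion (serine percentage, excess of basic over acidic residues) can be computed as in the sketch below. It is illustrative only: the short peptide is a hypothetical serine-rich stretch, not the N protein, the function names are assumptions, and histidine is left out of the basic residues, which is a choice of the example rather than a statement about the authors' count.

```python
BASIC = set("KR")    # lysine and arginine; histidine not counted (assumption)
ACIDIC = set("DE")   # aspartate and glutamate


def percent(part, total):
    """Percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * part / total, 1)


def composition(protein):
    """Return length, serine percentage and basic-minus-acidic count."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    basic = sum(aa in BASIC for aa in protein)
    acidic = sum(aa in ACIDIC for aa in protein)
    return {
        "length": len(protein),
        "serine_pct": percent(protein.count("S"), len(protein)),
        "basic_minus_acidic": basic - acidic,
    }


if __name__ == "__main__":
    print(percent(43, 448))             # serine content quoted in the text
    print(composition("ASSRASSAGSRS"))  # hypothetical serine-rich stretch
```

Applied to the 43 serine residues out of 448 quoted above, `percent` returns the 9.6 % given in the text.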

We also observed a strong sequence homology, not yet described, in the
first part of the N-terminal end of the N proteins of BECV-F15, BCV, MHV,
TGEV and IBV viruses:


Virus      Amino acid no.   Amino acid sequence
BECV-F15   118 to 134       QLLPRWYFYYLGTGPHA
JHMV       121 to 135       QLLPRWYFYYLGTGP
TGEV       89 to 101            RW FYYLGTGPHA
IBV        91 to 102             WYFYY GTGP A


This sequence has no peculiar properties: 9 hydrophilic and 8 hydrophobic 
residues. The biological significance of these findings is not known. 
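
The degree of conservation in such a block can be expressed as a percent identity over aligned positions. The sketch below is a minimal illustration: the function name is hypothetical, and the letter X stands for a position left blank in the comparison above, where the actual residue is not shown and is simply counted as a mismatch.

```python
def percent_identity(a, b, unknown="X"):
    """Percent identity between two aligned, equal-length sequences.

    Positions holding `unknown` in either sequence count as mismatches.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("empty alignment")
    matches = sum(
        x == y and x != unknown for x, y in zip(a.upper(), b.upper())
    )
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    # BECV-F15 against JHMV over the 15 shared positions.
    print(percent_identity("QLLPRWYFYYLGTGP", "QLLPRWYFYYLGTGP"))
    # BECV-F15 against the TGEV block, blank position written as X.
    print(round(percent_identity("RWYFYYLGTGPHA", "RWXFYYLGTGPHA"), 1))
```

Over these short windows the block is identical between BECV-F15 and JHMV, and 12 of the 13 positions shown are shared with TGEV.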


In conclusion, we have noticed that there are only minor changes between
the N proteins of BECV-F15 and the BCV Mebus strain. Work is in progress to
sequence the other virus genes and to find out how similar these two viruses
in fact are. Because of the antigenic differences established by monoclonal
antibody screening, the strain specificities should be found in the gene coding
for the spike gp105 protein.
[Figure 4: sequence listing (nucleotide sequence with deduced amino acids); not
legible in this copy.]


FIG. 4. Amino acid sequence of the proteins predicted from the main and secondary ORF.
RÉSUMÉ


SEQUENCE AND ANALYSIS OF THE GENOME
OF BOVINE ENTERITIC CORONAVIRUS (F15)


I. Sequence of the gene coding for the nucleocapsid protein;
analysis of the deduced protein


We cloned the genomic RNA of bovine enteritic coronavirus F15
(BECV-F15) in PBR322 plasmid after preparing the corresponding cDNA
with an oligo-dT primer: 265 clones were studied. Their hybridization
with poly(A)+ RNA extracted from infected cells allowed us
to locate them at the 3’-terminal end of the genome.


These clones were sequenced by the Sanger technique after subcloning
in M13 phage DNA. We determined a sequence of
1,710 nucleotides corresponding to the gene coding for the viral N protein.
It shows two overlapping open reading frames (ORF). At the
non-coding 3’-terminal end of the genome there is an 8-nucleotide sequence
that is also found in the homologous region of MHV, TGE and
IBV viruses. This sequence could be the RNA polymerase binding site.


The first AUG of the smaller ORF is preceded by an upstream nucleotide
sequence that makes it a potentially functional initiation site. The sequence
of the primary translation product deduced from it is a polypeptide of
207 amino acids (22.9 Kd) with a high leucine content (19.8 %) and a
hydrophobic N-terminal end.


The larger ORF has a coding capacity of 448 amino acids (49.4 Kd),
corresponding to the molecular mass of the N protein. The deduced protein
contains 43 serine residues (9.6 % of the amino acids), which may be
phosphorylated and involved in binding between the N protein and genomic
RNA. This protein also has 5 strongly basic regions; one of them
is also serine-rich and shows strong sequence homology
with the homologous region of the N proteins of MHV, TGE and IBV viruses. In
addition, the first part of the N-terminal end shows a stretch
of 12 amino acids (PRWYFYYLGTGP) that is highly conserved among these same
four viruses.


FIG. 5. Nucleotide sequence homology between the 3’-end of the genomes of BECV-F15
and MHV-JHM viruses.

[Figure 5: pairwise nucleotide alignment; sequence listing not legible in this copy.]
FIG. 6. Amino acid sequence homology of the N proteins of BECV-F15
and JHM-MHV viruses.

[Figure 6: pairwise amino acid alignment; sequence listing not legible in this copy.]
The N protein sequences of the BCV Mebus strain and of BECV-F15
show only minor differences.


KEY-WORDS: Coronavirus, Protein, Nucleocapsid, Genome; BECV-F15
strain, N-protein sequence.